mRAT-SQL+GAP: A Portuguese Text-to-SQL Transformer

Abstract

The translation of natural language questions to SQL queries has attracted growing attention, in particular in connection with transformers and similar language models. A large number of techniques are geared towards the English language; in this work, we thus investigated translation to SQL when the input questions are given in the Portuguese language. To do so, we properly adapted state-of-the-art tools and resources. We changed the RAT-SQL+GAP system by relying on a multilingual BART model (we also report tests with other language models), and we produced a translated version of the Spider dataset. Our experiments expose interesting phenomena that arise when non-English languages are targeted; in particular, it is better to train with the original and translated training datasets together, even if a single target language is desired. This multilingual BART model, fine-tuned with a double-size training dataset (English and Portuguese), achieved 83% of the baseline when making inferences for the Portuguese test dataset. This investigation can help other researchers produce Machine Learning results in languages different from English. Our multilingual-ready version of RAT-SQL+GAP and the data are available, open-sourced as mRAT-SQL+GAP at: https://github.com/C4AI/gap-text2sql.
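
The idea of training on the original and translated datasets together can be illustrated with a minimal sketch. It is not taken from the mRAT-SQL+GAP repository: the function name `merge_training_sets`, the `lang` tag, and the two sample records are hypothetical, and the `db_id`/`question`/`query` field names are assumed here to mirror the layout of Spider-style examples.

```python
import random


def merge_training_sets(original, translated, seed=0):
    """Concatenate two example lists and shuffle them reproducibly.

    Hypothetical helper: each example is a dict, and a "lang" tag is
    added only so the origin of every example stays visible.
    """
    merged = [dict(ex, lang="en") for ex in original]
    merged += [dict(ex, lang="pt") for ex in translated]
    random.Random(seed).shuffle(merged)  # fixed seed keeps the order repeatable
    return merged


# Illustrative records only; the SQL target is shared, the question differs.
english = [
    {
        "db_id": "concert_singer",
        "question": "How many singers do we have?",
        "query": "SELECT count(*) FROM singer",
    }
]
portuguese = [
    {
        "db_id": "concert_singer",
        "question": "Quantos cantores nós temos?",
        "query": "SELECT count(*) FROM singer",
    }
]

combined = merge_training_sets(english, portuguese)
print(len(combined), sorted(ex["lang"] for ex in combined))
```

The combined list is twice the size of either input, which corresponds to the "double-size" training dataset mentioned in the abstract; how the actual system loads and batches its data may differ.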

Similar resources

DIXI - Portuguese text-to-speech system

This paper describes the software architecture of the Portuguese text-to-speech system DIXI. The system has three major modules. The first one contains the text normalizer and searches each word in the lexicon. The second one is a multi-level rule based module for lexical stress assignment, orthographic to phonetic transcription, metrically based prosodic patterning and for generating the evoluti...

Applied Phonetics: Portuguese Text-to-Speech

This paper describes a text-to-speech application for a variety of Brazilian Portuguese. After presenting the language’s phonetic attributes, the orthographic system is examined and shown to be a function that maps letters to these sounds. Given the orthography’s phonological regularity, it is simple to implement the textual analysis portion of a speech synthesis system, as I demonstrate with s...

SQL-Grundlagen spielend lernen mit dem Text-Adventure SQL Island

We present SQL Island, a novel browser-based learning game built on the concept of text adventures. After a plane crash, the player character lands on an island. The player talks to the inhabitants, collects items, and fights villains. What makes this game special, however, is that the player controls the character solely through SQL commands. All the necessary comm...

From TUNA Attribute Sets to Portuguese Text: a First Report

This document describes the development of a surface realisation component for the Portuguese language that takes advantage of the data and evaluation tools provided by the REG-2008 team. At this initial stage, our work uses simple n-gram statistics to produce descriptions in the Furniture domain, with little or no linguistic variation. Preliminary results suggest that, unlike the generation of...

A Rule-Based Text-to-Speech System for Portuguese

This paper describes the latest progress in the development of a text-to-speech system for Portuguese. The system comprises 4 major modules: text normalization, linguistic and phonetic processing, generation of the synthesizer parameters and synthesis. The present rule-based version, based on the Klatt80 formant synthesizer, has achieved promising results, namely in what concerns the performanc...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-91699-2_35